Bayesian Distance Metric Learning

Authors

  • Xiao Fang
  • Najim Dehak
  • James R. Glass

Abstract

This thesis explores the use of Bayesian distance metric learning (Bayes-dml) for speaker verification using the i-vector feature representation. We propose a framework that explores the distance constraints between i-vector pairs from the same speaker and from different speakers. The distance metric is approximated as a weighted covariance matrix of the top eigenvectors of the data covariance matrix, and variational inference is used to estimate a posterior distribution over the distance metric. Given speaker labels, we select the different-speaker data pairs with the highest cosine scores to form a different-speaker constraint set. This set captures the most discriminative between-speaker variability in the training data. The system is evaluated on the female part of the 2008 NIST SRE dataset, and Bayes-dml is compared with cosine similarity scoring, the state-of-the-art approach. Experimental results show comparable performance between Bayes-dml and cosine similarity scoring. Furthermore, compared with cosine similarity scoring, Bayes-dml is insensitive to score normalization. Because it places no requirement on the number of labeled examples, Bayes-dml performs better when training data is limited.

Thesis Supervisor: James R. Glass
Title: Senior Research Scientist

Thesis Supervisor: Najim Dehak
Title: Research Scientist

Acknowledgments

First and foremost, I would like to thank Jim Glass and Najim Dehak for offering me the opportunity to do research in their group. I am grateful to Jim for his constant consideration and patience, and for always guiding me down the right path. I appreciate Najim's passion and brilliance. Najim's broad knowledge and thoughtful insights in this field have inspired me greatly throughout this thesis work. This thesis would not have been possible without them.

The Spoken Language Systems group has provided me with a research home for the past two years. Thank you to all the members for your energy, creativity, and friendship. The birthday celebrations, the spectrum reading seminars, and the defense suit-ups will remain unforgettable memories in my life. I would like to thank all my friends, old and new, locally available and geographically separated, for accompanying and encouraging me, and for sharing tears and joy with me. Lastly, I would like to thank my parents and my elder brother for their love, inspiration, and support all these years.

This thesis was supported by DARPA under the RATS (Robust Automatic Transcription of Speech) Program.
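
The abstract's constraint-selection step (choosing the different-speaker pairs with the highest cosine scores) can be illustrated with a minimal sketch. This is not the thesis code: the function names `cosine_score` and `hardest_different_speaker_pairs`, the `num_pairs` parameter, and the toy two-dimensional vectors are all assumptions made for illustration, and real i-vectors would have several hundred dimensions.

```python
import math
from itertools import combinations


def cosine_score(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hardest_different_speaker_pairs(ivectors, speakers, num_pairs):
    """Return the index pairs (i, j) of different-speaker vectors
    with the highest cosine scores, best-scoring pair first.

    Hypothetical helper: the thesis abstract only states that such
    pairs are selected, not how the selection is implemented.
    """
    scored = [
        (cosine_score(ivectors[i], ivectors[j]), i, j)
        for i, j in combinations(range(len(ivectors)), 2)
        if speakers[i] != speakers[j]  # keep different-speaker pairs only
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(i, j) for _, i, j in scored[:num_pairs]]


# Toy 2-D stand-ins for i-vectors (made-up values, illustration only).
ivectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.2]]
speakers = ["a", "a", "b", "c"]
pairs = hardest_different_speaker_pairs(ivectors, speakers, num_pairs=2)
print(pairs)
```

Scoring every pair is quadratic in the number of vectors, which is fine for a toy example; a large training set would likely need a batched or approximate search instead.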


Similar resources

Bayesian Active Distance Metric Learning

Distance metric learning is an important component for many tasks, such as statistical classification and content-based image retrieval. Existing approaches for learning distance metrics from pairwise constraints typically suffer from two major problems. First, most algorithms only offer point estimation of the distance metric and can therefore be unreliable when the number of training examples...

Semi-Supervised Composite Kernel Learning Using Distance Metric Learning Techniques

The distance metric plays a key role in many machine learning and computer vision algorithms, so choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...

Bayesian Multitask Distance Metric Learning

We present a Bayesian approach for jointly learning distance metrics for a large collection of potentially related learning tasks. We assume there exists a relatively smaller set of basis distance metrics and the distance metric for each task is a sparse, positively weighted combination of these basis distance metrics. The set of basis distance metrics and the combination weights are learned fr...

Learning hypothesis spaces and dimensions through concept learning

Generalizing a property from a set of objects to a new object is a fundamental problem faced by the human cognitive system, and a long-standing topic of investigation in psychology. Classic analyses suggest that the probability with which people generalize a property from one stimulus to another depends on the distance between those stimuli in psychological space. This raises the question of ho...

Bayesian Neighbourhood Component Analysis

Learning a good distance metric in feature space potentially improves the performance of the KNN classifier and is useful in many real-world applications. Many metric learning algorithms are, however, based on the point estimation of a quadratic optimization problem, which is time-consuming, susceptible to overfitting, and lacks a natural mechanism to reason with parameter uncertainty, an importan...

Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...

Publication date: 2014